A Determining Approach for Customer Behavior Analysis Using K-Mean Clustering
Rajeshri Lanjewar and Tripti Sharma
Chhatrapati Shivaji Institute of Technology, Durg. CG
*Corresponding Author E-mail: rajeshri21@csitdurg.in
ABSTRACT:
In any industry, the first step to finding and creating profitable customers is determining what drives profitability. This leads to better prospecting and more successful customer relationship management. A company can segment and profile its customer base to uncover those profit drivers using its knowledge of its customers, products, and markets, or it can use data-driven techniques to find natural clusters in its customer or prospect base. Whatever the method, the process leads to knowledge and understanding that are critical to maintaining a competitive edge. Analyzing consumer behavior requires a costly implementation of sophisticated information technology, together with detailed planning and business knowledge for successful adoption. The current trend in consumer behavior analysis is to focus on the business problem rather than on the information technology.
KEYWORDS:
I. INTRODUCTION:
Many enterprises have accumulated large databases. Database marketing uses modern data analysis methods to acquire new customers and to develop new business strategies and opportunities. Unlike conventional analyses, which usually just summarize the data, data mining involves the automated analysis of data to produce useful knowledge in a highly summarized form [2]. Data mining is thus very useful in market segmentation, customer profiling, risk analysis, and other applications. It can also produce rules and models that help replicate or generalize decisions and that can be applied to determine marketing strategies. Economic theory has established that there are a large number of customers with a small income and a small number of customers with a large income. However, instead of targeting all prospects equally or providing the same incentive offers to all customers,
What is the importance of understanding customers?
Studies show that many companies operate for years, pumping out offers for products and services, without a clue as to what their best customer looks like. For every company in every industry, understanding the customer is the most important first step to profitable marketing. As with modeling, before a company begins any profiling or segmentation project, it is important to establish its objective. This is crucial because the objective affects how the task is approached. The objective can be explained by reviewing the definitions of profiling and segmentation [3].
Profiling is exactly what it implies: the act of using data to describe or profile a group of customers or prospects. It can be performed on an entire database or distinct sections of the database. The distinct sections are known as segments. Typically they are mutually exclusive, which means no one can be a member of more than one segment.
Segmentation is the act of splitting a database into distinct sections or segments. There are two basic approaches to segmentation: market driven and data driven. Market-driven approaches allow managers to use characteristics that they determine to be important drivers of their business. In other words, they preselect the characteristics that define the segments. This is why defining the objective is so critical: the ultimate plans for using the segments determine the best method for creating them. Data-driven approaches, on the other hand, use techniques such as cluster analysis or factor analysis to find homogeneous groups. This can be useful when companies are working with data about which they have little knowledge [1].
Figure 1. The two-stage framework of consumer behavior analysis.
Enterprises can select only those customers who meet certain profitability criteria based on their individual needs or consumer behaviors [13]. Therefore, assuming that consumer behavior follows a similar pattern seems reasonable. In banking, most existing data mining approaches have discovered rules [1, 27] or predicted personal bankruptcy [10, 12, 30] from a bank database. Hadden et al. [17] review the literature on the development of a customer churn management platform. As shown in Figure 1, a two-stage framework of consumer behavior analysis was established to predict profitable customers based on demographic characteristics and previous consumer behavior [22].
II. RELATED WORKS:
Customer segmentation based on customer value: Customer value has been studied under the names LTV (Life Time Value), CLV (Customer Lifetime Value), CE (Customer Equity), and Customer Profitability [4]. Previous research defines LTV as the sum of the revenues gained from a company’s customers over the lifetime of their transactions, after deducting the total cost of attracting, selling to, and servicing those customers, and taking into account the time value of money (Dwyer, 1997; Hoekstra and Huizingh, 1999; Jain and Singh, 2002) [5].
Customer segmentation methods using LTV can be classified into three categories: (1) segmentation using only LTV values, (2) segmentation using LTV components, and (3) segmentation considering both LTV values and other information. In the first method, the list of customers’ LTV values is sorted in descending order and divided by percentile. Here the customer list is segmented by LTV alone; however, other information, such as socio-demographic information or transaction analysis, may be used alongside it for better marketing practice. For instance, after identifying a highly profitable customer group, a firm may recommend popular products to that group at a discounted price. Fig. 2 briefly depicts the concept of segmentation using only LTV [6]. The second method performs segmentation by considering the components used in the LTV calculation. Hwang, Jung, and Suh (2004) considered three factors (current value, potential value, and customer loyalty) to calculate LTV and presented a method that uses the three factors for customer segmentation. Fig. 3 shows segmentation using the factors in the LTV calculation. The last method segments the customer list using the LTV value together with other managerial information. In this case, LTV is one axis of an n-dimensional segment space, and other information, such as socio-demographic information and transaction history, forms the other axes [7]. This approach is more meaningful for segmenting the customer list than the first method. Fig. 4 shows a customer list segmented with LTV value and other managerial information.
Fig. 2. Customer segmentation using LTV.
Fig. 3. Customer segmentation using LTV components.
Fig. 4. Customer segmentation based on LTV and other information.
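The first method (sorting by LTV and dividing the list by percentile) can be sketched as follows. This is a minimal illustration, not the procedure of the cited studies: the function name `segment_by_ltv`, the dictionary input, and the choice of equal-sized bands are all assumptions made for the example.

```python
def segment_by_ltv(ltv_by_customer, n_segments=4):
    """Rank customers by LTV (highest first) and cut the ranked list
    into n_segments percentile bands of (nearly) equal size.

    Hypothetical helper: the source only says the sorted list is
    "divided by its percentile"; equal-sized bands are an assumption.
    """
    if n_segments < 1:
        raise ValueError("n_segments must be at least 1")
    # Sort customers in descending order of LTV.
    ranked = sorted(ltv_by_customer.items(), key=lambda kv: kv[1], reverse=True)
    n = len(ranked)
    segments = {}
    for rank, (customer, _ltv) in enumerate(ranked):
        # Band 0 holds the top 1/n_segments of customers, band 1 the next, etc.
        segments[customer] = rank * n_segments // n
    return segments


if __name__ == "__main__":
    ltv = {"a": 900.0, "b": 120.0, "c": 450.0, "d": 30.0}
    print(segment_by_ltv(ltv, n_segments=2))
```

Band 0 corresponds to the most valuable customers; other information, such as socio-demographic fields, could then be examined within each band.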
Most discussions in the marketing literature and textbooks describe behavioral segmentation in terms of usage volume, such as heavy users, medium users, and light users (Kotler, 1997), or brand-buying behavior, such as brand loyals, other-brand loyals, and brand switchers (Rossiter and Percy, 1997). Customer profitability can serve as another important basis for behavioral segmentation because of the central importance of profits (Mulhern, 1999). Several segments may be formed by using customer profitability [8]. For instance, the most profitable segment, consisting of the highest-profit customers, should be retained through loyalty and retention programs. Another possible segment is the most unprofitable customer group, who generate more costs than profit. This segment is debatable, since unprofitable customers seem not to be worth marketing effort [9]. Verhoef and Donkers (2001) used two dimensions, current value and potential value, to segment the customers of an insurance company. In this study, we use three dimensions (current value, potential value, and customer loyalty) so that customer defection is also considered [26]. Current value is a measure of customers’ past profitability, potential value is a measure of the possibility of additional sales, and customer loyalty is a measure of customer retention. After calculating the three customer values, we perform customer segmentation using them [25].
III. AN EXISTING FRAMEWORK FOR BUILDING MANAGERIAL STRATEGIES BASED ON CUSTOMER VALUE:
A framework for building managerial strategies based on customer value is organized into three phases. Phase I covers the preparation steps to be carried out before defining customer value and setting up marketing strategies. In Phase II, we evaluate customer value from three viewpoints: current value, potential value, and customer loyalty. After the customer base is segmented along these three viewpoints, a segment analysis is performed on the segmentation results. Phase III analyzes the characteristics of each segment according to current value, potential value, and customer loyalty, and presents the procedure for building strategies based on these three customer values [10].
Segmentation based on customer value: The raw data of this study consist of 6 months of service data from a wireless communication company in Korea [29]. The data fall roughly into two types: socio-demographic information and wireless service usage information. The dataset is composed of 200 data fields and 16,384 customer records; 101 data fields remained after unessential fields were eliminated [11]. Missing values were replaced with the mean for continuous variables and the mode for class variables [28]. In addition, we divided the entire dataset into a training set and a validation set at a ratio of 70 to 30. We used the same method of calculating the customer values (current value, potential value, and customer loyalty) as suggested by the previous study (Hwang, Jung, and Suh, 2004).
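The preprocessing described above (mean and mode substitution for missing values, then a 70-to-30 split) can be sketched in plain Python. The record layout, the field names, the use of `None` for a missing value, and the random shuffle before splitting are assumptions for illustration; the source does not say how the split was drawn.

```python
import random
from statistics import mean, mode


def impute(records, continuous_fields, class_fields):
    """Replace None with the field mean (continuous) or mode (class).

    Assumes every listed field has at least one non-missing value;
    otherwise statistics.mean/mode raise StatisticsError.
    """
    fill = {}
    for field in continuous_fields:
        fill[field] = mean(r[field] for r in records if r[field] is not None)
    for field in class_fields:
        fill[field] = mode(r[field] for r in records if r[field] is not None)
    return [
        {k: (fill[k] if v is None and k in fill else v) for k, v in r.items()}
        for r in records
    ]


def split_70_30(records, seed=0):
    """Shuffle a copy of the records and split it 70/30 (training/validation)."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 7 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    rows = [
        {"minutes": 100.0, "plan": "A"},
        {"minutes": None, "plan": "A"},
        {"minutes": 300.0, "plan": None},
        {"minutes": 200.0, "plan": "B"},
    ]
    train, valid = split_70_30(impute(rows, ["minutes"], ["plan"]))
    print(len(train), len(valid))
```

Computing the fill values on the full dataset before splitting mirrors the order in which the text describes the steps; computing them on the training set only would be an alternative design.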
We calculate the current value as the average service charge billed to a customer minus the average amount in arrears for that customer, over the 6-month calculation period [27].
Current Value = (Average amount asked to pay for a customer) - (Cumulative amount in arrears for the customer / Total period of use) [24].
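As a worked illustration of the current value formula, the sketch below computes it for one customer. The function and parameter names are hypothetical, and the sample figures are invented purely to show the arithmetic.

```python
def current_value(monthly_charges, cumulative_arrears, months_of_use):
    """Average amount billed minus average amount in arrears.

    monthly_charges: amounts billed in each month of the calculation window
    cumulative_arrears: total unpaid amount for the customer
    months_of_use: total period of use, in months
    (All names are illustrative assumptions, not from the source.)
    """
    if not monthly_charges or months_of_use <= 0:
        raise ValueError("need at least one charge and a positive period of use")
    average_billed = sum(monthly_charges) / len(monthly_charges)
    average_arrears = cumulative_arrears / months_of_use
    return average_billed - average_arrears


if __name__ == "__main__":
    # Six months billed at an average of 50, with 30 in arrears over 6 months.
    print(current_value([50, 60, 40, 50, 50, 50], cumulative_arrears=30, months_of_use=6))
```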
Calculating potential value: As mentioned before, it is important to consider cross-selling and up-selling as well when calculating customer value (Kim and Kim, 1999). We define the potential value of a customer as the expected profit that can be obtained from that customer when the customer uses the additional services of a wireless communication company. The following equation evaluates potential value [23]:

\[ \text{Potential Value}_i = \sum_{j=1}^{n} \text{Prob}_{ij} \times \text{Profit}_{ij} \]
Prob_ij is the probability that customer i will use service j among the n optional services. Profit_ij is the profit that the company can receive from customer i when that customer uses optional service j. In other words, the equation above gives the expected profit from a particular customer who uses the optional services provided by a wireless communication company [30]. This expected profit is the potential value we need to evaluate. We calculated Profit_ij by subtracting the cost of each optional service from its charge; both the charge and the cost of each optional service were given by the telecommunication company. Potential value can serve as a measure of additional sales opportunity and can be used to recommend optional services to customers [12].
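The potential value described here is an expected profit: the usage probability of each optional service multiplied by that service's profit (charge minus cost), summed over the services. A minimal sketch for one customer follows; the function name, the dictionary inputs, and the service names are assumptions, and in practice the probabilities would come from a fitted model.

```python
def potential_value(prob, charge, cost):
    """Expected profit for one customer over the optional services.

    prob:   {service: probability the customer uses it} (Prob_ij for fixed i)
    charge: {service: price charged for the service}
    cost:   {service: cost of providing the service}
    Profit_ij is taken as charge - cost, as described in the text.
    """
    return sum(prob[s] * (charge[s] - cost[s]) for s in prob)


if __name__ == "__main__":
    # Hypothetical services and figures, for illustration only.
    prob = {"roaming": 0.5, "data_pack": 0.25}
    charge = {"roaming": 10.0, "data_pack": 20.0}
    cost = {"roaming": 4.0, "data_pack": 8.0}
    print(potential_value(prob, charge, cost))
```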
Customer loyalty: Customer loyalty can be defined as an index of how likely customers are to remain customers of a company: Customer Loyalty = 1 - Churn rate. Churn describes the number or percentage of regular customers who abandon a relationship with a service provider. Customer loyalty can be a measure of customer retention [13]. Previous studies on customer value have not yet treated the churn rate, limiting themselves to predicting the future profit change of customers from their past profit history. An effective evaluation of customer value, however, should account for the leaving probability of each customer. Fig. 5 shows the procedure for calculating an individual churn rate [14]. Therefore, this paper measures the leaving probability of each customer, using data mining techniques, to calculate the churn rate. As in the process for calculating Prob_ij, we build several models (decision tree, neural network, and logistic regression) and then select the best among them based on a comparative test using the misclassification rate or the lift chart method.
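The model comparison and the loyalty index can be sketched as below. Only the misclassification-rate comparison is shown (the lift chart alternative is omitted), the candidate models are represented simply by their predictions on a validation set, and all function names and sample values are assumptions for illustration.

```python
def misclassification_rate(predicted, actual):
    """Fraction of validation records whose predicted label is wrong."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equal length")
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return wrong / len(actual)


def select_model(predictions_by_model, actual):
    """Return the name of the model with the lowest misclassification rate."""
    return min(
        predictions_by_model,
        key=lambda name: misclassification_rate(predictions_by_model[name], actual),
    )


def customer_loyalty(churn_probability):
    """Customer Loyalty = 1 - churn rate, for one customer."""
    if not 0.0 <= churn_probability <= 1.0:
        raise ValueError("churn probability must lie in [0, 1]")
    return 1.0 - churn_probability


if __name__ == "__main__":
    actual = [1, 0, 0, 1]  # 1 = customer churned (hypothetical labels)
    predictions = {
        "decision_tree": [1, 0, 1, 1],
        "logistic_regression": [1, 0, 0, 1],
        "neural_network": [0, 1, 0, 1],
    }
    print(select_model(predictions, actual))
    print(customer_loyalty(0.25))
```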
IV. PROPOSED WORK:
Input – number of clusters k and data set D containing n objects.
Output – A set of k clusters.
1) From D, randomly generate k points as the initial cluster centres.
2) Assign each object to a cluster to which the object is the most similar, based on the cluster mean value and the object value.
3) Re-compute mean of each cluster from the objects in it and update the cluster means.
4) Repeat steps 2 and 3 till there is no change in clusters.
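The four steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: similarity is taken to be squared Euclidean distance, the initial centres are drawn as k distinct objects from D, an empty cluster keeps its previous centre, and a `max_iter` cap is added as a safeguard. None of these details are specified in the text.

```python
import random


def kmeans(points, k, max_iter=100, seed=0):
    """K-means on a list of equal-length numeric tuples.

    Returns (labels, centres): labels[i] is the cluster index of points[i].
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)
    # Step 1: pick k distinct objects from D as the initial cluster centres.
    centres = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each object to the nearest centre
        # (squared Euclidean distance is an assumption).
        new_labels = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centres[c])))
            for p in points
        ]
        # Step 4: stop when the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: recompute each centre as the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centre
                centres[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centres


if __name__ == "__main__":
    data = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
    labels, centres = kmeans(data, k=2)
    print(labels, centres)
```

For customer segmentation, each point could be a customer's (current value, potential value, customer loyalty) triple, ideally rescaled to comparable ranges first; that choice of features is an assumption here rather than something this section states.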
V. CONCLUSION:
A successful profiling and segmentation process demands that a company define its business objectives [18]. At the start of any segmentation process, management should agree on and clearly state its goals using language that reflects targeting and measurement. Business objectives can be (1) new account, sales, or usage driven; (2) new product driven; (3) profitability driven; or (4) product or service positioning driven. Furthermore, the types of data could include survey data, geo-demographic overlays, and transactional behavior. Data must be relevant to the business objectives [20]. The process involves reviewing all data to identify only the necessary elements, because collecting and analyzing data on all customers or prospects is very time-consuming and expensive. The segmentation process also means selecting a method that is appropriate for the situation. Three segmentation methods could be employed: predefined segmentation, statistical segmentation, or hybrid segmentation [21]. The predefined segmentation method allows the analyst to create the segment definitions based on prior experience and analysis. In this case, the data is known, the work involves a limited number of variables, and a limited number of segments are determined. The appropriate segments are then defined and selected based on the business objective and knowledge of the customer base [22].
VI. REFERENCES:
1. Au, W. H. and Chan, K. C. C., Mining fuzzy association rules in a bank-account database, IEEE Transactions on Fuzzy Systems, Vol. 11, 2003.
2. Baesens, B., Viaene, S., Poel, D., Vanthienen, J. and Dedene, G., Bayesian neural network for repeat purchase modelling in direct marketing, European Journal of Operational Research, Vol. 138, pp.191-211, 2002.
3. Balakrishnan, P. V. S., Cooper, M. C., Jacob, V. S. and Lewis, P. A., Comparative performance of the FSCL neural net and K-means algorithm for market segmentation, European Journal of Operational Research, Vol. 93, pp.346-357, 1996.
4. Berry, M. J. A. and Linoff, G. S., Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management, 2nd Edition, John Wiley and Sons, 2004.
5. Cai, Y., Cercone, N. and Han, J., An attribute-oriented approach for learning classification rules from relational databases, Proceedings of Conference on Data Engineering, pp.281-288, 1990.
6. Cendrowska, J., PRISM: an algorithm for inducing modular rules, International Journal of Man-Machine Studies, Vol. 27, pp.349-370, 1988.
7. Chan, C.-C. H., Intelligent spider for information retrieval to support mining-based price prediction for online auctioning, Expert Systems with Applications, Vol. 34, pp.347-356, 2008.
8. Chan, C.-C. H., Online auction customer segmentation using a neural network model, International Journal of Applied Science and Engineering, Vol. 3, pp.101-109, 2005.
9. Chiang, D. A., Chen, W., Wang, Y. F. and Hwang, L. J., Rules generation from the decision tree, Journal of Information Science and Engineering, Vol. 17, pp.325-339, 2001.
10. Dasgupta, C. G., Dispensa, G. S. and Ghose, S., Comparing the predictive performance of a neural network model with some traditional market response models, International Journal of Forecasting, Vol. 10, pp.235-244, 1994.
11. Davies, F., Moutinho, L. and Curry, B., ATM user attitudes: a neural network analysis, Marketing Intelligence and Planning, Vol. 14, pp.26-32, 1996.
12. Donato, J. C., Schryver, G. C., Hinkel, R. L., Schmoyer, J., Leuze, M. R. and Grandy, N. W., Mining multi-dimensional data for decision support, Future Generation Computer Systems, Vol. 15, pp.433-441, 1999.
13. Dyche, J. and Dych, J., The CRM Handbook: A Business Guide to Customer Relationship Management, Addison-Wesley Pub Co., August 2001.
14. Fayyad, U. M., Branching on attribute values in decision tree generalization, Proceedings of 20th National Conference on Artificial Intelligence, AAAI-94, pp.104-110, 1994.
15. Feelders, A. J., Credit scoring and reject inference with mixture models, International Journal of Intelligent Systems in Accounting, Finance and Management, Vol. 9, pp.1-8, 2000.
16. Fish, K. E., Barnes, J. H. and Aiken, M. W., Artificial neural networks - A new methodology for industrial market segmentation, Industrial Marketing Management, Vol. 24, pp.431-438, 1995.
17. Hadden, J., Tiwari, A., Roy, R. and Ruta, D., Computer assisted customer churn management: State-of-the-art and future trends, Computers and Operations Research, Vol. 34, pp.2902-2917, 2005.
18. Hornik, K., Stinchcombe, M. and White, H., Multilayer feedforward networks are universal approximators, Neural Networks, Vol. 2, pp.336-359, 1989.
19. Hsieh, N.-C., Hybrid mining approach in the design of credit scoring models, Expert Systems with Applications, Vol. 28, pp.655-665, 2005.
20. Kim, Y. S. and Sohn, S. Y., Managing loan customers using misclassification patterns of credit scoring model, Expert Systems with Applications, Vol. 26, pp.567-573, 2004.
21. Kohonen, T., Self-organizing maps, Springer, Berlin, 1995.
22. Kuo, R. J., An, Y. L., Wang, H. S. and Chung, W. J., Integration of Self-organizing Feature Maps Neural Network and Genetic K-Means Algorithm for Market Segmentation, Expert Systems with Applications, Vol. 30, pp.313-324, 2006.
23. Lancher, R. C., Coats, P. K., Shanker, C. S. and Fant, L. F., A neural network for classifying the financial health of a firm, European Journal of Operational Research, Vol. 85, pp.53-65, 1995.
24. Lee, T. S., Chiu, C. C., Lu, C. J. and Chen, I. F., Credit scoring using the hybrid neural discriminate technique, Expert Systems with Applications, Vol. 23, pp.245-254, 2002.
25. Malhotra, R. and Malhotra, D. K., Evaluating consumer loans using neural networks, Omega, Vol. 31, pp.83-96, 2003.
Received on 23.11.2011 Accepted on 28.12.2011
© EnggResearch.net All Rights Reserved
Int. J. Tech. 1(2): July-Dec. 2011; Page 125-129